** Data reading and variable selection from raw data
** East Asian Social Survey - South Korea

** 01. Reading data **

cap log close
clear all
set more off
cd /*insert your work directory here*/
use /*read your data here*/

** 01.1. Selecting only South Korea **

keep if V2 == 3
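
* Optional sanity check (a sketch, not part of the original workflow).
* It assumes V2 == 3 identifies South Korea, as the filter above implies,
* and stops the do-file if the filter left no observations.
count
assert _N > 0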

** 02. Constructing year and country variables **

ge year=2006
lab var year "survey year"

ge country=410
lab var country "ISO country code"
// South Korea: 410 (ISO 3166-1 numeric country code)


** 03. ID variables **

ge pid=V3
lab var pid "person id"
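
* Optional check (a sketch). It assumes V3 uniquely identifies respondents
* within the South Korean sample, which the source data may not guarantee.
* -isid- exits with an error if pid has duplicates or missing values;
* -capture noisily- shows that message without stopping the do-file.
capture noisily isid pid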

** 04. Basic Demographics (Sex and Age/birth year) **

ge sex=SEX
lab var sex "sex"
lab def sex 1 "male" 2 "female"
lab val sex sex

ge age=AGE
lab var age "age"

* birth year is approximated as survey year minus age at interview
ge birthyr = year - age
lab var birthyr "birth year"

** 05. Siblings **

* Number of siblings excludes the respondent. The original measure (V10) includes the respondent, so we subtract one.

ge nsibs=V10 - 1

* Number of brothers and number of sisters. V5 and V6 capture older brothers and sisters respectively, while V8 and V9 capture younger brothers and sisters.

ge nbro = V5 + V8
ge nsis = V6 + V9

* Birth order is the number of older siblings (V5 + V6) plus one.
ge birthorder= V5 + V6 + 1

lab var nbro "number of brothers"
lab var nsis "number of sisters"
lab var nsibs "number of siblings"
lab var birthorder "birth order"
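
* Optional consistency check (a sketch). If V10 counts all siblings including
* the respondent, nsibs should equal nbro + nsis. That is an assumption about
* the questionnaire; this only counts disagreements and does not change data.
count if nsibs != nbro + nsis & !missing(nsibs, nbro, nsis)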


** 06. Own education **

rename EDUCYRS educ_yrs

rename DEGREE educ_cat


//label respondent education

lab var educ_yrs "number of years of education completed"

lab var educ_cat "highest level of education completed"


** 07. Parents' education: Father and/or Mother **

rename PADEGR faeduc_cat

rename MADEGR moeduc_cat

rename PAEDYRS faeduc_yrs

rename MAEDYRS moeduc_yrs

lab var faeduc_cat "father's education level"

lab var moeduc_cat "mother's education level"

lab var faeduc_yrs "father's number of years of education completed"

lab var moeduc_yrs "mother's number of years of education completed"


** 08. Own occupation **

rename ISCO88 occ_ISCO

rename WRKST emp_stat

lab var occ_ISCO "current occupation (ISCO-88, 4-digit)"

lab var emp_stat "employment status"


** 09. Parents' occupation **

//not available

** 10. Tabulate the Identified Variables **

log using /*insert your log file path here*/, replace text

** Data reading and variable selection from raw data
** East Asian Social Survey - South Korea

** Sex **
tab sex

** Age, Birth Year **
sum age birthyr, d

** Siblings **
sum nsibs nsis nbro birthorder, d

** R's Own Education **
tab1 educ_cat educ_yrs

** Parental Education **
tab1 faeduc_cat moeduc_cat faeduc_yrs moeduc_yrs

** R's Own Occupation **
tab1 occ_ISCO emp_stat

log close

** 11. Keep the identified variables only **

keep year country pid sex age birthyr ///
	 nsibs nsis nbro birthorder ///
	 educ_cat faeduc_cat moeduc_cat educ_yrs faeduc_yrs moeduc_yrs ///
	 emp_stat occ_ISCO


** 11.1. Create ISCED education variables **


* The same years-to-ISCED mapping is applied to respondent, mother, and father.
* The values 97 and 98 are handled separately from the year ranges.
* Any other value (including missing) is left as missing.
foreach p in educ moeduc faeduc {
	ge `p'_ISCED = .
	replace `p'_ISCED = 000 if `p'_yrs == 0 | `p'_yrs == 98
	replace `p'_ISCED = 100 if `p'_yrs > 0 & `p'_yrs < 7
	replace `p'_ISCED = 200 if `p'_yrs > 6 & `p'_yrs < 11
	replace `p'_ISCED = 300 if `p'_yrs > 10 & `p'_yrs < 13
	replace `p'_ISCED = 400 if `p'_yrs == 13
	replace `p'_ISCED = 500 if `p'_yrs == 14
	replace `p'_ISCED = 600 if `p'_yrs > 14 & `p'_yrs < 17
	replace `p'_ISCED = 700 if `p'_yrs == 17
	replace `p'_ISCED = 800 if `p'_yrs > 17 & `p'_yrs < 97
	replace `p'_ISCED = 300 if `p'_yrs == 97
}
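
* Optional check (a sketch, not part of the original workflow). Cross-tabulate
* years of schooling against the derived ISCED code to confirm that each value
* falls in the intended band and to see which values remain unclassified.
tab educ_yrs educ_ISCED, missing
tab moeduc_yrs moeduc_ISCED, missing
tab faeduc_yrs faeduc_ISCED, missing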


** 12. Save the Data File **

saveold /*insert your output file path here*/, replace
